Accelerated loading of guest virtual machine from live snapshot

ABSTRACT

Loading a guest virtual machine from a snapshot includes determining a plurality of executable modules loaded in a guest operating system. Hash values for pages of guest physical memory in the snapshot file are determined. Hash values for pages of the executable modules executing in the guest operating system are determined. Matches to the pages in the guest physical memory and the pages of the executable modules are searched for using the hash values. Context information associated with the matching pages in the guest physical memory and the pages of the executable modules is written to the snapshot. The snapshot is modified to link the guest physical memory to the pages of the executable modules.

CROSS-REFERENCE TO RELATED APPLICATIONS

This Application claims priority to U.S. Provisional Patent Application Ser. No. 62/356,406, filed on Jun. 29, 2016, entitled “Accelerated Loading Guest Virtual Machine from Live Snapshot,” which is incorporated herein by reference in its entirety.

FIELD OF INVENTION

The present invention relates generally to computing systems and, more particularly, to operating system memory management when loading a guest virtual machine from a live snapshot.

BACKGROUND OF INVENTION

Address space layout randomization (ASLR) is a security feature available in Windows Vista™ and later versions of Windows®. ASLR is intended to prevent an attacker from predicting target virtual addresses of important data/functions. This feature overrides the default load addresses of executable modules specified in a Portable Executable (PE) image header, and sets a new random load address for every loaded module. The new load address of each module is valid until the next restart of the operating system. It also randomizes virtual memory allocation requests from applications.

While ASLR can improve security, one disadvantage of ASLR is that, because of the randomization of load addresses, modules that could otherwise be shared by host operating systems and guest operating systems are not able to be shared. This is because the randomization of load addresses causes the modules to start at different virtual addresses. Thus, while the modules may be the same on a disk, after loading, they are at different virtual addresses and cannot be shared. As a result, memory usage may be increased in systems that implement ASLR.

SUMMARY OF INVENTION

The present invention relates to a processor configured for loading a guest virtual machine from a live snapshot. In order to accelerate loading of the guest virtual machine in an address space layout randomization (ASLR) environment, a snapshot of the guest virtual machine may be optimized for creating a live snapshot of the guest virtual machine. The live snapshot can be applied to a suspended virtual machine via running a resuming process.

In one embodiment of the present invention, a method is provided for optimizing a guest virtual machine (VM) snapshot. The method for optimizing may be comprised of two parts. A first part can include running guest modules in memory and sending them to a host, and a second part may include taking a snapshot of the VM and creating a live guest snapshot file. In the beginning of the first part, information about executable modules running inside of a guest OS of the VM is identified and obtained in a guest OS. Subsequently, the processor may create a digest (e.g., hash, MD5, etc.) of every page of the guest physical memory in the snapshot file and a digest (e.g., hash, MD5, etc.) of every page of the executable module running in the guest OS and compare the hashes for matching the pages of executable modules or memories. The second part may include storing context information associated with the matches to the pages in the guest physical memory and the pages of the executable modules to a guest snapshot file, and then modifying the snapshot file to link the guest physical memory to the pages of the executable modules. This method can provide for quick location of the corresponding module and page during guest load time and the live guest OS snapshot can be obtained.

One embodiment of the method of the present invention can include the steps of resuming a suspended guest VM from the optimized snapshot in the ASLR environment. The processor may read context information of the guest OS modules, and map the executable modules containing matching pages from the snapshot into a host process virtual memory. The relocation can be performed using the context information stored in the live guest snapshot.

In another embodiment of the present invention, a method is provided for loading a guest VM snapshot in the ASLR environment. The processor may determine whether the guest OS extracted a plurality of executable modules from the guest OS. Subsequently, information of executable modules running inside a guest OS of a VM may be identified and obtained in a guest OS. The processor can create a memory hash for every page of the guest physical memory in the snapshot file and a module hash of every page of the executable modules running in the guest OS. The processor can search the hashes for matching the pages of executable modules or memories. Once the guest VM preparation steps are executed in the processor, then the step of storing the context information associated with the matches to the pages in guest physical memory and the pages of the executable modules to a guest snapshot file, and the step of modifying the snapshot to link the guest physical memory to the pages of the executable modules may be undertaken. Subsequently, the processor may read the context information of the guest OS modules, and map the executable modules containing matching pages from the snapshot into the host process virtual memory. The relocation may be performed using the context information stored in the live guest snapshot instead of a portable executable (PE) header section. The processor may check the first hash value of the relocated matching page with the hash value stored in the context information.

Another embodiment of the present invention provides a non-transitory computer readable storage medium having a program configured for loading the guest VM in the ASLR environment. For accelerating the loading of the guest VM, the program is executed by the processor for determining whether the guest OS extracted a plurality of executable modules from the guest OS of the VM. Subsequently, information of the executable modules running inside a guest OS of a VM is identified and obtained in a guest OS. The program may be executed to create a memory hash for every page of the guest physical memory in the snapshot file and a module hash of every page of the executable modules running in the guest OS, and then search the hashes for matching the pages of executable modules or memories. The program may be configured for storing the context information associated with the guest physical memory and the pages of the executable modules, and then modifying the snapshot to link the guest physical memory to the pages of the executable modules. Subsequently, the program can be enabled to read the context information and perform a mapping of the executable modules with matching pages from the snapshot into the host process virtual memory. The relocation is performed by using the context information stored in the live guest snapshot. The program can continuously check the hashes for matching the pages of the module and memory. If the module has been changed or updated on the host OS, the original physical memory page coming from the guest snapshot is used instead.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the disclosure, reference may be made to the accompanying drawings in which:

FIG. 1 is a block diagram illustrating example memory layouts in ASLR environments following operating system boots in accordance with one embodiment of the present invention;

FIG. 2 is a block diagram illustrating a module relocation in ASLR environments in accordance with one embodiment of the present invention;

FIG. 3 is a block diagram illustrating a module relocation in ASLR environments using a relocation section in an executable image in accordance with one embodiment of the present invention;

FIG. 4 is a block diagram illustrating virtual machines in an ASLR environment in accordance with one embodiment of the present invention;

FIG. 5 is a flow chart illustrating a method for optimizing a guest virtual machine snapshot in an ASLR environment in accordance with one embodiment of the present invention;

FIG. 6 is a flow chart illustrating a method for resuming a suspended guest virtual machine from an optimized snapshot in an ASLR environment in accordance with one embodiment of the present invention; and

FIG. 7 is a block diagram of an example embodiment of a computer system upon which embodiments of the inventive subject matter can execute in accordance with one embodiment of the present invention.

DETAILED DESCRIPTION

In the following detailed description of example embodiments of the invention, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration specific example embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the inventive subject matter, and it is to be understood that other embodiments may be utilized and that logical, mechanical, electrical and other changes may be made without departing from the scope of the inventive subject matter.

Some portions of the detailed descriptions which follow are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the ways used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, conceived to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like. It should be borne in mind, however, that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Unless specifically stated otherwise as apparent from the following discussions, terms such as “processing” or “computing” or “calculating” or “determining” or “displaying” or the like, refer to the action and processes of a computer system, or similar computing device, that manipulates and transforms data represented as physical (e.g., electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system memories or registers or other such information storage, transmission or display devices.

In the Figures, the same reference number is used throughout to refer to an identical component that appears in multiple Figures. Signals and connections may be referred to by the same reference number or label, and the actual meaning will be clear from its use in the context of the description. In general, the first digit(s) of the reference number for a given item or part of the invention should correspond to the Figure number in which the item or part is first identified.

The description of the various embodiments is to be construed as examples only and does not describe every possible instance of the inventive subject matter. Numerous alternatives could be implemented, using combinations of current or future technologies, which would still fall within the scope of the claims. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the inventive subject matter is defined only by the appended claims.

In various aspects of the present invention, a guest snapshot is a saved state of a virtual machine. A live guest snapshot is a saved state of a running virtual machine. The saved state can contain the state of virtual hardware, including the content of virtual machine physical memory. A guest snapshot is typically saved to a file or set of files associated with a virtual machine.

When a virtual machine (VM) uses the same type and version of the host operating system (OS), there is a higher likelihood that modules or files can be found to be shared between the VM and the host OS. Analysis of shared memory blocks is performed on a saved state, or during the creation of this saved state file. The results of the analysis can be used to determine which modules can be shared between a host OS and a VM.

FIG. 1 is a block diagram illustrating example memory layouts in ASLR environments following operating system boots. A process memory layout for a host operating system 100 typically contains tens of various system executable modules 110 and 120, each of which is assigned a randomized virtual address at every operating system restart. Each restart generates new virtual addresses for these modules 180 and 190. The same technique also applies to every guest operating system running under the control of the host operating system. The relocation of modules across reboots and guest operating systems can have negative consequences, because although modules from both the host and guest operating system are binary-identical on disk, they are loaded at different virtual addresses and thus cannot be shared across host and guest operating systems.

Some aspects of the present invention utilize the PE module. A PE module has a preferred default load address (ImageBase) specified in its image header. This address is referred to as a default load address and it specifies a preferred location in virtual memory of a process where the module should reside after it is loaded. If the module is loaded into its default load address, no relocation is necessary before module execution. Otherwise the module must be relocated to a different virtual address before it can be executed. Those of skill in the art having the benefit of the disclosure will appreciate that other types of executable modules could be used and are within the scope of the inventive subject matter.
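As an illustration of where the default load address lives, the following minimal sketch reads the ImageBase field from a PE image held in memory. It relies on the publicly documented PE layout (the `e_lfanew` field at offset 0x3C, a 20-byte COFF file header, and the optional-header magic values 0x10B for PE32 and 0x20B for PE32+). The function name `read_image_base` is hypothetical and is not part of the disclosure.

```python
import struct

def read_image_base(pe: bytes) -> int:
    """Return the preferred load address (ImageBase) from a PE image."""
    if pe[:2] != b"MZ":
        raise ValueError("missing DOS header signature")
    # e_lfanew: file offset of the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", pe, 0x3C)
    if pe[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", pe, opt)
    if magic == 0x10B:   # PE32: 4-byte ImageBase at offset 28
        return struct.unpack_from("<I", pe, opt + 28)[0]
    if magic == 0x20B:   # PE32+: 8-byte ImageBase at offset 24
        return struct.unpack_from("<Q", pe, opt + 24)[0]
    raise ValueError("unknown optional header magic")

# Build a tiny synthetic PE32+ header purely for demonstration.
buf = bytearray(0x200)
buf[0:2] = b"MZ"
struct.pack_into("<I", buf, 0x3C, 0x80)
buf[0x80:0x84] = b"PE\0\0"
struct.pack_into("<H", buf, 0x80 + 24, 0x20B)
struct.pack_into("<Q", buf, 0x80 + 24 + 24, 0x180000000)
image_base = read_image_base(bytes(buf))
```

A loader would compare this value with the address actually chosen for the module to decide whether relocation is needed.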

When a module is being loaded by an operating system module loader (e.g., Windows NT® module loader), the loader may determine a different load address for the module. For example, a loader may load a module at a different load address from the default load address if another module or process heap is already occupying the module's default load address.

Module relocation is a process of modifying a module's executable code, so that it can be executed from the actual load address when the actual load address is different from the module's default load address. Module relocation is typically performed each time a module is loaded into a non-default load address.

FIG. 2 is a block diagram illustrating an example module relocation in ASLR environments. In the example illustrated in FIG. 2, a module 200 was linked to an executable file and its preferred default load address was set to 0x0000 by the linker. Some code instructions have been already prepared (by the linker) based on the predefined memory address. For example, the “JSUB #Routine” instruction 210, that causes execution to jump to a subroutine called #Routine, can be translated into the following byte sequence: “4B 10 10 36”. The first two bytes, “4B 10,” identify the instruction (JSUB) and the next two bytes, “10 36,” represent the address where execution is to jump (0x1036). However, when the module 200 is relocated to a load address that is different from the default load address, these two bytes must be recalculated to reflect the new load address. For example, the byte pattern “4B 10 10 36” is replaced in memory with “4B 10 60 36” if the module is relocated from its default load address to a 0x5000 load address.
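The arithmetic in the FIG. 2 example can be sketched as follows. This is an illustration only: it assumes the toy encoding from the figure (a two-byte address field stored most significant byte first), and the helper name `relocate_address_field` is hypothetical.

```python
def relocate_address_field(code: bytes, offset: int, delta: int) -> bytes:
    """Return a copy of code with the 2-byte address at offset shifted by delta."""
    old_address = int.from_bytes(code[offset:offset + 2], "big")
    new_address = (old_address + delta) & 0xFFFF  # stay within two bytes
    return code[:offset] + new_address.to_bytes(2, "big") + code[offset + 2:]

default_load_address = 0x0000
actual_load_address = 0x5000
instruction = bytes.fromhex("4B101036")  # JSUB to 0x1036

relocated = relocate_address_field(
    instruction, offset=2, delta=actual_load_address - default_load_address
)
print(relocated.hex(" ").upper())  # 4B 10 60 36
```

The jump target moves from 0x1036 to 0x6036, which is the original target plus the difference between the actual and default load addresses.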

FIG. 3 is a block diagram illustrating a module relocation in ASLR environments using a relocation section in an executable image. As noted above, relocation can slow down module loading time and it can also have a negative memory impact, because relocated executable memory pages are typically duplicated using a copy-on-write (COW) technique. A PE module 300 may have a special section in its image header called “.RELOC” 310 (referred to as a relocation section). The relocation section contains relocation information for all module pages that must be relocated before module execution. Module relocation can be skipped if the module default load address and the module actual load address are the same.

FIG. 4 is a block diagram illustrating virtual machines in an ASLR environment. User memory of a physical computer 400 can be used for a main OS (also referred to as a host operating system), running applications and VMs. The number of VMs that can run on a computer system depends on available free memory. In the example illustrated in FIG. 4, two VMs 410A and 410B are executing on the computer. Those of skill in the art having the benefit of the disclosure will appreciate that more or fewer VMs may execute on the computer. When the host OS is the same as the guest OS in VMs 410A and 410B, many system modules are typically binary-identical on disk and eligible for use across all OSs. In the example illustrated in FIG. 4, every OS has ASLR active and its modules are located at different locations in memory. Each OS restart also loads these modules again at different memory locations. For example, this prevents sharing of the “NTDLL” module 430 in memory between the host OS and VMs 410A and 410B. As a consequence, each running VM has a larger memory footprint 440.

In some aspects of the present invention, in order to accelerate loading of a guest virtual machine in an ASLR environment, a snapshot of a guest virtual machine is taken and the snapshot is analyzed and optimized to create a live snapshot of the guest virtual machine. The optimized live snapshot of the guest virtual machine can be used to resume a suspended virtual machine.

FIG. 5 is a flow chart illustrating a method for optimizing a guest virtual machine snapshot in an ASLR environment. This phase consists of two parts 500 and 530. In a first part of the method 500, a VM is prepared for use, and information about currently running guest modules in memory can be gathered and sent to a host. In a second part of the method 530, a snapshot of the VM is taken and a live guest snapshot file is created.

At block 502, information about executable modules running inside a guest OS of a VM is obtained. In some aspects, all user mode modules loaded in a guest OS can be enumerated by using OS routines. In embodiments that are implemented within the Microsoft Windows family of operating systems, the OS routines CreateToolhelp32Snapshot, Process32First/Process32Next and Module32First/Module32Next APIs can be utilized to enumerate user mode modules. Loaded kernel mode modules can be identified using a NtQuerySystemInformation (SystemModuleInformation) call. The previous two methods for module enumeration may not reveal modules not loaded in any process but still present (i.e., cached) in guest memory (i.e., standby pages), but it is possible to use the undocumented SuperFetch API (Windows Vista™ or later), namely NtQuerySystemInformation (SystemSuperFetchInformation), to query physical memory page attributes containing the required information. Those of skill in the art having the benefit of the disclosure will appreciate that equivalent functionality may be provided in other operating systems, and that use of such equivalent functionality is within the scope of the inventive subject matter.

For every identified executable module, the following information can be extracted from the guest OS: a module file system path, which is expected to match the host OS; and a module default load address, which is usually different from the host OS because of ASLR. The resulting module information is then passed to the host OS for further processing as described below.

At block 504, physical memory regions of a guest OS are located in a snapshot file. A digest (e.g., hash; MD5) can be created of every page of guest physical memory. The page hashes are used later to search for matching module pages.
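A minimal sketch of the per-page digest step at block 504 is shown below. It assumes a 4 KiB page size and that the guest physical memory region has already been read from the snapshot file into a byte string; the names `PAGE_SIZE` and `hash_pages` are hypothetical, and MD5 is used only because the text names it as an example digest.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size; not specified in the text

def hash_pages(memory: bytes, page_size: int = PAGE_SIZE) -> dict[int, bytes]:
    """Map each page's starting offset to the MD5 digest of its contents."""
    return {
        offset: hashlib.md5(memory[offset:offset + page_size]).digest()
        for offset in range(0, len(memory), page_size)
    }

# Two synthetic pages standing in for guest physical memory.
guest_memory = b"\x00" * PAGE_SIZE + b"\x01" * PAGE_SIZE
guest_hashes = hash_pages(guest_memory)
```

Keying the table by page offset keeps the physical address of each page available for the later matching and linking steps.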

At block 506, the module information extracted from the guest OS is then used to create a hash of every page of each executable module running in the guest OS. Each module is mapped into host process memory as an executable image using a guest file path (that is expected to be the same in the host OS). If a module does not exist on a host, it is ignored. The module is then relocated to a guest default load address, but only if the host load address is different (e.g., through ASLR). In some aspects, this module relocation is performed as if the module is loaded at the default address. Conflicts in the use of the default load address are not a concern for purposes of creating the hash, as the module will not be executed. A hash is prepared for every page of a module loaded in the guest OS.

At block 508, pages of guest physical memory are then compared with pages of executable modules from the previous operation (i.e., block 506). In some aspects, only a comparison of hash values is performed for performance reasons. Once a matching page (i.e., matching hash) pair is found, a second level comparison is performed (e.g., a second hash or a simple comparison of page data to avoid possible race conditions).
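The two-level comparison at block 508 might look like the following sketch. It assumes both the guest memory and the relocated module image are available as byte strings and uses a direct byte comparison as the second-level check; the function names and the 4 KiB page size are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict

PAGE_SIZE = 4096  # assumed page size

def split_pages(data: bytes, page_size: int = PAGE_SIZE) -> list[bytes]:
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]

def find_matches(guest_memory: bytes, module_image: bytes) -> list[tuple[int, int]]:
    """Return (guest_page_index, module_page_index) pairs with identical content."""
    module_pages = split_pages(module_image)
    # First level: index module pages by digest for fast lookup.
    by_hash = defaultdict(list)
    for index, page in enumerate(module_pages):
        by_hash[hashlib.md5(page).digest()].append(index)
    matches = []
    for guest_index, guest_page in enumerate(split_pages(guest_memory)):
        for module_index in by_hash.get(hashlib.md5(guest_page).digest(), ()):
            # Second level: confirm the candidate by comparing page data.
            if guest_page == module_pages[module_index]:
                matches.append((guest_index, module_index))
                break
    return matches

module_image = b"A" * PAGE_SIZE + b"B" * PAGE_SIZE
guest_memory = b"C" * PAGE_SIZE + b"B" * PAGE_SIZE + b"A" * PAGE_SIZE
matches = find_matches(guest_memory, module_image)
```

The hash lookup keeps the common case cheap, while the byte comparison guards against treating two different pages as equal.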

At block 532, context information about the modules identified at block 502 that contain at least one page identified in guest physical memory is obtained. The information is then saved into the guest snapshot file. In some aspects, the context information includes: assigned module unique identifier (may be just an index); module file path; module default load address from PE header; guest load base address. For every relocated page of the module, the following is also saved into the guest snapshot file: assigned page module local unique identifier (an index can be used); the relative virtual address (RVA) of the page; physical address of the guest matching physical memory page; relocation information for this page; and a hash of the page. The RVA is the offset of the module page from the beginning of the module (when mapped into memory).

The resulting context information is serialized into the guest snapshot file for later use during guest OS load time. Data structures or object state is translated into a format that can be stored into a file.
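One possible shape for the context information and its serialization is sketched below. The field names mirror the items listed for block 532, but the record layout, the JSON encoding, and the representation of relocation information as a list of byte offsets are all assumptions made for illustration; the text does not specify an on-disk format.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class PageContext:
    page_id: int                    # module-local unique page identifier
    rva: int                        # offset of the page from the module start
    guest_physical_address: int     # address of the matching guest page
    relocation_offsets: list[int]   # assumed form of the relocation information
    page_hash_hex: str              # digest of the page, hex-encoded

@dataclass
class ModuleContext:
    module_id: int                  # unique module identifier (an index)
    file_path: str
    default_load_address: int       # from the PE header
    guest_load_base: int
    pages: list[PageContext] = field(default_factory=list)

def serialize(modules: list[ModuleContext]) -> str:
    return json.dumps([asdict(module) for module in modules])

def deserialize(text: str) -> list[ModuleContext]:
    return [
        ModuleContext(**{**m, "pages": [PageContext(**p) for p in m["pages"]]})
        for m in json.loads(text)
    ]

# Illustrative values only.
example = ModuleContext(
    module_id=0,
    file_path=r"C:\Windows\System32\ntdll.dll",
    default_load_address=0x180000000,
    guest_load_base=0x7FF912340000,
    pages=[PageContext(0, 0x1000, 0x2A000, [0x10, 0x48], "00" * 16)],
)
restored = deserialize(serialize([example]))
```

Storing the relocation information per page is what later allows the resume path to relocate pages without consulting the module's own relocation section.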

At block 534, the guest physical memory attributes are modified to contain links to module pages. For each matching physical page, the following information can be persisted into page attributes: a special “page is linked to host module” flag is set for the page; a unique module identifier; a module local unique page identifier. In some aspects, this can provide for quick location of the corresponding module and page during guest load time.
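The page attribute update at block 534 can be pictured with the following sketch. The flag value, the `PageAttributes` record, and the helper names are hypothetical; the text only states that a flag and the two identifiers are persisted for each matching page.

```python
from dataclasses import dataclass
from typing import Optional

PAGE_LINKED_TO_HOST_MODULE = 0x1  # hypothetical flag bit

@dataclass
class PageAttributes:
    flags: int = 0
    module_id: Optional[int] = None   # unique module identifier
    page_id: Optional[int] = None     # module-local unique page identifier

def link_page(attrs: PageAttributes, module_id: int, page_id: int) -> None:
    """Mark a guest physical page as backed by a host module page."""
    attrs.flags |= PAGE_LINKED_TO_HOST_MODULE
    attrs.module_id = module_id
    attrs.page_id = page_id

def is_linked(attrs: PageAttributes) -> bool:
    return bool(attrs.flags & PAGE_LINKED_TO_HOST_MODULE)

# Attributes for three guest pages; only page 1 matched a module page.
page_attributes = [PageAttributes() for _ in range(3)]
link_page(page_attributes[1], module_id=0, page_id=7)
```

With the identifiers stored alongside the flag, the loader can go directly from a guest page to the module page that backs it.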

At this point, a live guest OS snapshot has been created.

FIG. 6 is a flow chart illustrating a method for resuming a suspended guest virtual machine from an optimized snapshot in an ASLR environment.

At block 600, context information of guest OS modules, prepared in the suspending process described above with respect to FIG. 5, is read.

At block 602, each identified executable module must be mapped into the host process virtual memory as an executable image. This operation is typically very fast as the host OS is already using the executable module.

Blocks 604 and 606 are executed to preprocess every page of the identified modules.

At block 604, page relocation of pages of the module is performed to match the guest load address. The relocation is performed using the context information stored in the live guest snapshot instead of information in the “.RELOC” image header section. This is because the “.RELOC” information is likely paged out on the host OS and accessing it would result in extra disk I/O.

At block 606, the page is hashed and the resulting hash value is compared with the hash value from the context information to ensure it is still the very same page. For example, it is possible that the module has already been changed or updated on the host OS. In this case, the original physical memory page coming from the guest snapshot is used instead.
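The check at block 606 and its fallback can be sketched as follows. This is an illustration under assumptions: MD5 is used because the text names it as an example digest, and `choose_page` is a hypothetical helper that receives the relocated host module page, the hash stored in the context information, and the original page kept in the guest snapshot.

```python
import hashlib

def choose_page(host_page: bytes, expected_hash: bytes, snapshot_page: bytes) -> bytes:
    """Use the host module page only if it still matches the stored hash."""
    if hashlib.md5(host_page).digest() == expected_hash:
        return host_page
    # The module changed on the host: fall back to the snapshot's own copy.
    return snapshot_page

original = b"\x90" * 4096
stored_hash = hashlib.md5(original).digest()

unchanged = choose_page(original, stored_hash, snapshot_page=original)
updated = choose_page(b"\xCC" * 4096, stored_hash, snapshot_page=original)
```

In both cases the guest ends up with the page contents it had when the snapshot was taken; the hash check only decides whether those contents can come from the shared host module.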

For each guest physical memory page with the “page is linked to host module” flag set, the unique module and page identifiers saved in the physical page attributes are used to process the module page prepared as described above with respect to blocks 602-606 of the method.

In some embodiments, once the guest memory load is finished, all mapped executable modules are released to minimize virtual address space consumption. This can be especially desirable for 32-bit systems with limited virtual address space.

FIG. 7 is a block diagram of an example embodiment of a computer system 700 upon which embodiments of the inventive subject matter can execute. The description of FIG. 7 is intended to provide a brief, general description of suitable computer hardware and a suitable computing environment in conjunction with which the invention may be implemented. In some embodiments, the inventive subject matter is described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc., that perform particular tasks or implement particular abstract data types.

As indicated above, the system as disclosed herein can be spread across many physical hosts. Therefore, many systems and sub-systems of FIG. 7 can be involved in implementing the inventive subject matter disclosed herein.

Moreover, those skilled in the art will appreciate that the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, smart phones, network PCs, minicomputers, mainframe computers, and the like. Embodiments of the invention may also be practiced in distributed computer environments where tasks are performed by I/O remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

The above described systems and methods of sharing memory of executable modules between a guest operating system and a host operating system with active ASLR can result in significant improvement in the functioning of a computing system. Whenever a guest operating system and a host operating system share a same or similar executable module set, the above described systems and methods can save large numbers of memory blocks. Further, in some aspects, the disk I/O needed when starting a guest operating system from live snapshots, and the size of those snapshots on disk, can be reduced.

With reference to FIG. 7, an example embodiment extends to a machine in the example form of a computer system 700 within which instructions for causing the machine to perform any one or more of the methodologies discussed herein may be executed. In alternative example embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. Further, while only a single machine is illustrated, the term “machine” shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The example computer system 700 may include a processor 702 (e.g., a central processing unit (CPU), a graphics processing unit (GPU) or both), a main memory 704 and a static memory 706, which communicate with each other via a bus 708. The computer system 700 may further include a video display unit 710 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). In example embodiments, the computer system 700 also includes one or more of an alpha-numeric input device 712 (e.g., a keyboard), a user interface (UI) navigation device or cursor control device 714 (e.g., a mouse), a disk drive unit 716, a signal generation device 718 (e.g., a speaker), and a network interface device 720.

The disk drive unit 716 includes a machine-readable medium 722 on which is stored one or more sets of instructions 724 and data structures (e.g., software instructions) embodying or used by any one or more of the methodologies or functions described herein. The instructions 724 may also reside, completely or at least partially, within the main memory 704 or within the processor 702 during execution thereof by the computer system 700, the main memory 704 and the processor 702 also constituting machine-readable media.

While the machine-readable medium 722 is shown in an example embodiment to be a single medium, the term “machine-readable medium” may include a single medium or multiple media (e.g., a centralized or distributed database, or associated caches and servers) that store the one or more instructions. The term “machine-readable medium” shall also be taken to include any tangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of embodiments of the present invention, or that is capable of storing, encoding, or carrying data structures used by or associated with such instructions. The term “machine-readable storage medium” shall accordingly be taken to include, but not be limited to, solid-state memories and optical and magnetic media that can store information in a non-transitory manner, i.e., media that is able to store information. Specific examples of machine-readable media include non-volatile memory, including by way of example semiconductor memory devices (e.g., Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), and flash memory devices); magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.

The instructions 724 may further be transmitted or received over a communications network 726 using a signal transmission medium via the network interface device 720 and utilizing any one of a number of well-known transfer protocols (e.g., FTP, HTTP). Examples of communication networks include a local area network (LAN), a wide area network (WAN), the Internet, mobile telephone networks, Plain Old Telephone (POTS) networks, and wireless data networks (e.g., WiFi and WiMax networks). The term “machine-readable signal medium” shall be taken to include any transitory intangible medium that is capable of storing, encoding, or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible medium to facilitate communication of such software.

Although an overview of the inventive subject matter has been describedwith reference to specific example embodiments, various modificationsand changes may be made to these embodiments without departing from thebroader spirit and scope of embodiments of the present invention. Suchembodiments of the inventive subject matter may be referred to herein,individually or collectively, by the term “invention” merely forconvenience and without intending to voluntarily limit the scope of thisapplication to any single invention or inventive concept if more thanone is, in fact, disclosed.

As is evident from the foregoing description, certain aspects of the inventive subject matter are not limited by the particular details of the examples illustrated herein, and it is therefore contemplated that other modifications and applications, or equivalents thereof, will occur to those skilled in the art. It is accordingly intended that the claims shall cover all such modifications and applications that do not depart from the spirit and scope of the inventive subject matter. Therefore, it is manifestly intended that this inventive subject matter be limited only by the following claims and equivalents thereof.

The Abstract is provided to comply with 37 C.F.R. § 1.72(b) to allow the reader to quickly ascertain the nature and gist of the technical disclosure. The Abstract is submitted with the understanding that it will not be used to limit the scope of the claims.

What is claimed is:
 1. A method for optimizing a guest virtual machine (VM) snapshot, the method comprising the steps of: loading a plurality of executable modules in a guest operating system (OS); extracting information of the executable modules in the guest OS; creating a memory hash for pages of guest physical memory of the guest OS; creating a module hash for pages of the executable modules executing in the guest OS; searching for matches to the pages of guest physical memory and the pages of the executable modules; storing context information associated with the matches to the pages of guest physical memory and the pages of the executable modules to a snapshot; and modifying the snapshot to link the guest physical memory to the pages of the executable modules.
 2. The method of claim 1, further comprising the step of determining at least one of a module file system path and a module default load address for each of the plurality of executable modules.
 3. The method of claim 1, wherein the memory hash for the pages of guest physical memory is located in the snapshot.
 4. The method of claim 1, wherein the step of creating the module hash for the pages of the executable modules executing in the guest OS includes relocating an executable module as if the module is loaded at a default address associated with the module prior to determining the module hash.
 5. The method of claim 1, wherein the step of searching for matches to the pages of guest physical memory and the pages of the executable modules includes searching for matches of the memory hash to the module hash.
 6. The method of claim 1, wherein the context information includes at least one of an assigned module unique identifier, a module file path, a module default load address from a portable executable (PE) header, a guest load base address, a relative virtual address (RVA) of at least one of the pages of the executable modules, a physical address of at least one of the pages of the guest physical memory, relocation information, the memory hash, and the module hash.
 7. The method of claim 1, wherein the step of modifying the snapshot to link the guest physical memory to the pages of the executable modules includes storing attributes of the guest physical memory for providing a quick location function.
 8. The method of claim 1, further comprising the steps of: reading the context information from the snapshot; mapping modules containing matching pages from the snapshot into a host process virtual memory; and relocating the matching pages for the modules to match a guest load address, the relocating using the context information.
 9. The method of claim 8, further comprising the step of checking a first hash of the relocated matching page with a hash stored in the context information.
 10. A method for loading a guest virtual machine (VM) snapshot, the method comprising the steps of: determining to extract a plurality of executable modules in a guest operating system (OS); extracting information of the executable modules in the guest OS; creating a memory hash value for pages of guest physical memory; creating a module hash value for pages of the executable modules executing in the guest OS; searching for matches to the pages of the guest physical memory and the pages of the executable modules; storing context information associated with the matches to the pages of the guest physical memory and the pages of the executable modules to a guest snapshot file; modifying the snapshot file to link the guest physical memory to the pages of the executable modules; reading the context information from the snapshot file; mapping modules containing matching pages from the snapshot file into a host process virtual memory; relocating the matching pages for the modules to match a guest load address, the relocating using the context information; and checking a first hash value of the relocated matching page with a hash value stored in the context information.
 11. The method of claim 10, further comprising the step of determining at least one of a module file system path and a module default load address for each of the plurality of executable modules.
 12. The method of claim 10, wherein the memory hash value for the pages of the guest physical memory is located in the snapshot file.
 13. The method of claim 10, wherein the step of creating the module hash value for the pages of the executable modules executing in the guest OS includes relocating an executable module as if the module is loaded at the default address associated with the module prior to determining the module hash value.
 14. The method of claim 10, wherein the step of searching for matches to the pages of guest physical memory and the pages of the executable modules includes searching for matches of the memory hash value to the module hash value.
 15. The method of claim 10, wherein the context information includes at least one of an assigned module unique identifier, a module file path, a module default load address from a portable executable (PE) header, a guest load base address, a relative virtual address (RVA) of at least one of the pages of the executable modules, a physical address of at least one of the pages of the guest physical memory, relocation information, the memory hash value, and the module hash value.
 16. The method of claim 10, wherein the step of modifying the snapshot file to link the guest physical memory to the pages of the executable modules includes storing attributes of the guest physical memory for providing a quick location function.
 17. A non-transitory computer readable storage medium having a program stored thereon, the program causing a computer to execute the steps of: determining to extract a plurality of executable modules in a guest operating system (OS); extracting information of the executable modules in the guest OS; creating a memory hash value for pages of guest physical memory; creating a module hash value for pages of the executable modules executing in the guest OS; searching for matches to the pages of the guest physical memory and the pages of the executable modules; storing context information associated with the matches to the pages of the guest physical memory and the pages of the executable modules to a guest snapshot file; modifying the snapshot file to link the guest physical memory to the pages of the executable modules; reading the context information from the snapshot file; mapping modules containing matching pages from the snapshot file into a host process virtual memory; relocating the matching pages for the modules to match a guest load address, the relocating using the context information; and checking a first hash value of the relocated matching page with a hash value stored in the context information.
 18. The non-transitory computer readable storage medium of claim 17, wherein the instructions further comprise instruction for: determining at least one of a module file system path and a module default load address for each of the plurality of executable modules.